PredictSNP: Robust and Accurate Consensus Classifier for Prediction of Disease-Related Mutations
Authors
Abstract
Single nucleotide variants represent a prevalent form of genetic variation. Mutations in the coding regions are frequently associated with the development of various genetic diseases. Computational tools for predicting the effects of mutations on protein function are very important for the analysis of single nucleotide variants and their prioritization for experimental characterization. Many computational tools are already widely employed for this purpose. Unfortunately, their comparison and further improvement are hindered by large overlaps between the training datasets and benchmark datasets, which lead to biased and overly optimistic reported performances. In this study, we have constructed three independent datasets by removing all duplicates, inconsistencies and mutations previously used in the training of the evaluated tools. The benchmark dataset containing over 43,000 mutations was employed for the unbiased evaluation of eight established prediction tools: MAPP, nsSNPAnalyzer, PANTHER, PhD-SNP, PolyPhen-1, PolyPhen-2, SIFT and SNAP. The six best performing tools were combined into a consensus classifier, PredictSNP, which achieved significantly improved prediction performance and at the same time returned results for all mutations, confirming that consensus prediction represents an accurate and robust alternative to the predictions delivered by individual tools. A user-friendly web interface enables easy access to all eight prediction tools, the consensus classifier PredictSNP and annotations from the Protein Mutant Database and the UniProt database. The web server and the datasets are freely available to the academic community at http://loschmidt.chemi.muni.cz/predictsnp.
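The idea of a consensus over individual tools can be illustrated with a minimal sketch. The abstract does not state how the six tools are combined, so the plain majority vote below is an assumption made for illustration only, not the PredictSNP method. The function name `consensus_call`, the label strings, the tie-breaking rule and the example predictions are all hypothetical. Tools that return no prediction are skipped, which shows how a consensus can still return a result when an individual tool cannot.

```python
from collections import Counter
from typing import Mapping, Optional

# Hypothetical label strings used only in this sketch.
DELETERIOUS = "deleterious"
NEUTRAL = "neutral"


def consensus_call(predictions: Mapping[str, Optional[str]]) -> Optional[str]:
    """Return the majority label among the tools that produced a prediction.

    `predictions` maps a tool name to its label, or to None when the tool
    returned no result for the mutation. Ties resolve to DELETERIOUS, which
    is an arbitrary choice made for this sketch.
    """
    # Count only recognised labels; missing (None) predictions are ignored.
    votes = Counter(
        label for label in predictions.values()
        if label in (DELETERIOUS, NEUTRAL)
    )
    if not votes:
        return None  # no tool produced a usable prediction
    if votes[DELETERIOUS] >= votes[NEUTRAL]:
        return DELETERIOUS
    return NEUTRAL


# Illustrative input: the tool outputs below are made up.
example = {
    "MAPP": None,  # tool returned no prediction
    "PhD-SNP": DELETERIOUS,
    "PolyPhen-2": DELETERIOUS,
    "SIFT": DELETERIOUS,
    "SNAP": NEUTRAL,
    "PolyPhen-1": NEUTRAL,
}
print(consensus_call(example))  # deleterious
```

A weighted vote, in which each tool's vote is scaled by a confidence estimate, would be a natural refinement of this sketch.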
Similar resources
The prediction model for cardiovascular disease using Yazd's health study data (YaHS)
Introduction: Ischemic heart disease is one of the most common diseases, which has led to high mortality rates all over the world. This disease is caused by narrowing or blockage of coronary arteries, which are the provider of blood to the heart. Identifying the people susceptible to this disease and bringing changes in their lifestyles has been said to reduce the related mortality rates and inc...
Intelligent and Robust Genetic Algorithm Based Classifier
The concepts of robust classification and intelligently controlling the search process of genetic algorithm (GA) are introduced and integrated with a conventional genetic classifier for development of a new version of it, which is called Intelligent and Robust GA-classifier (IRGA-classifier). It can efficiently approximate the decision hyperplanes in the feature space. It is shown experime...
Prediction of blood cancer using leukemia gene expression data and sparsity-based gene selection methods
Background: DNA microarray is a useful technology that simultaneously assesses the expression of thousands of genes. It can be utilized for the detection of cancer types and cancer biomarkers. This study aimed to predict blood cancer using leukemia gene expression data and a robust ℓ2,p-norm sparsity-based gene selection method. Materials and Methods: In this descriptive study, the microarray ...
A Novel Type-2 Adaptive Neuro Fuzzy Inference System Classifier for Modelling Uncertainty in Prediction of Air Pollution Disaster (RESEARCH NOTE)
Type-2 fuzzy set theory is one of the most powerful tools for dealing with the uncertainty and imperfection in dynamic and complex environments. The applications of type-2 fuzzy sets and soft computing methods are rapidly emerging in the ecological fields such as air pollution and weather prediction. The air pollution problem is a major public health problem in many cities of the world. Predict...
A Robust Structural Fingerprint Restoration
Fast and accurate ridge detection in fingerprints is essential to each AFIS (Automatic Fingerprint Identification System). Smudged furrows and cut ridges in the image of a fingerprint are major problems in any AFIS. This paper investigates a new online ridge detection method that reduces the complexity and costs associated with the fingerprint identification procedure. The noise in fingerprint...